Bitcoin vs. Gold in Crisis: Evaluating Volatility Dynamics During COVID-19 (Time Series Modelling)

Objectives:

This study investigates the volatility dynamics of Bitcoin and Gold returns before and during the COVID-19 pandemic. The study applies ARMA and EGARCH models to daily return data. The sample is divided into two periods, pre-COVID and COVID-19, excluding the transitional crash period in early March 2020.

This document provides the R code and a walkthrough for the modelling process, using the Bitcoin pre-COVID period as the primary example. For the Gold series (GC=F) and for the COVID-19 period, the ticker and time periods in the code should be updated accordingly.

Keywords: Bitcoin, Gold, COVID-19, Volatility, Time series, EGARCH, Financial crisis, Leverage effect

(a) Scrape Data from Yahoo Finance

# Bitcoin pre-COVID 
btc_data <- getSymbols("BTC-USD", src = "yahoo"
                       , from = "2018-01-01", to = "2020-02-28" 
                       # during COVID from = "2020-03-13", to = "2022-04-30"
                       , auto.assign = FALSE)

# gold_data <- getSymbols("GC=F", src = "yahoo"
#                        , from = "2018-01-01", to = "2020-02-28"
#                        , auto.assign = FALSE)

(b) Compute the log-return (using log-difference)

Raw asset prices are typically non-stationary. To prepare the data for time-series modeling, we compute continuously compounded log returns. This transformation is an essential step to achieve stationarity by removing the stochastic price trend.

btc_returns <- diff(log(btc_data$"BTC-USD.Close"), lag = 1) %>%  na.omit()
colnames(btc_returns) <- "btc_returns"
# (optional) convert return to an xts object using xts()

(c) Plot Bitcoin returns and Perform pre-estimation checks

# Plotting returns using ggplot2
ggplot(btc_returns, aes(x = Index, y = btc_returns)) +
  geom_line() +
  labs(title = "Bitcoin Returns (BTC-USD)",
       x = "Date",
       y = "Returns") +
  theme_minimal()

# ADF test for nonstationarity
adf.test(btc_returns)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  btc_returns
## Dickey-Fuller = -8.7292, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary

The plot suggests the transformed series is stationary, a property known as volatility clustering is visually evident. The stationarity of log returns is formally confirmed using the Augmented Dickey-Fuller (ADF) test. Since the p-value is less than 0.05, we reject the null hypothesis of non-stationarity.

# Descriptive Data
summary(btc_returns$btc_returns)

##      Index             btc_returns        
##  Min.   :2018-01-02   Min.   :-0.1845817  
##  1st Qu.:2018-07-17   1st Qu.:-0.0164121  
##  Median :2019-01-30   Median : 0.0009966  
##  Mean   :2019-01-30   Mean   :-0.0005763  
##  3rd Qu.:2019-08-15   3rd Qu.: 0.0157544  
##  Max.   :2020-02-28   Max.   : 0.1600420

# sd(btc_returns); skewness(btc_returns); kurtosis(btc_returns)
# Test for normality: Ho: data is normally distributed
jarque.bera.test(btc_returns)

## 
##  Jarque Bera Test
## 
## data:  btc_returns
## X-squared = 329.04, df = 2, p-value < 2.2e-16

Subsequently, descriptive statistics are calculated to investigate the properties of Bitcoin returns. The Jarque-Bera (JB) test formally shows that Bitcoin returns clearly deviate from normality (leptokurtic or “fat-tailed” distribution), which is a key stylized fact of financial returns. This non-normality justifies the use of GARCH models and a non-normal error distribution (like the Generalized Error Distribution, GED).

(d) Select optimal ARMA model

Next, we inspect the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots of the returns to suggest potential lags. To select the optimal specification, Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) are used to balance model goodness-of-fit with complexity.

# Plot the correlogram (ACF, PACF) to identify potential lags
ggAcf(btc_returns)

ggPacf(btc_returns)

# Choose optimal model
# Use auto.arima to suggest appropriate model
auto.arima(btc_returns, d = 0, ic = "aic",max.p = 10, max.q = 10)

## Series: btc_returns 
## ARIMA(1,0,0) with zero mean 
## 
## Coefficients:
##           ar1
##       -0.0292
## s.e.   0.0357
## 
## sigma^2 = 0.001489:  log likelihood = 1447.22
## AIC=-2890.43   AICc=-2890.42   BIC=-2881.09

# Besides, the suggested model from auto.arima.
# we manually try some models and compare information criteria
# Since data is stationary returns, d=0. Assumed no seasonality
order_list <- list(
  c(0, 0, 0), # White noise (baseline)
  c(1, 0, 0), # ARMA(1,1)
  c(1, 0, 1), # ARMA(1,1)
  c(1, 0, 2), # ARMA(1,2)
  c(2, 0, 1), # ARMA(2,1)
  c(4, 0, 0), # ARMA(4,1)
  c(4, 0, 4)  # ARMA(4,4)
)
# Fit model lists and evaluate AIC and BIC
results <- data.frame(p = integer(), q = integer(),  AIC = numeric(),  BIC = numeric())
for (order in order_list) {
  p <- order[1]
  q <- order[3]
  fit <- tryCatch({
    Arima(btc_returns, order = c(p, 0, q))
  }, error = function(e) return(NULL)
  )
  if (!is.null(fit)) {
    results <- rbind(results, data.frame(
      p = p,
      q = q,
      AIC = AIC(fit),
      BIC = BIC(fit)
    ))
  }
}
print(results)

##   p q       AIC       BIC
## 1 0 0 -2889.939 -2880.600
## 2 1 0 -2888.622 -2874.613
## 3 1 1 -2886.924 -2868.246
## 4 1 2 -2887.354 -2864.006
## 5 2 1 -2887.256 -2863.909
## 6 4 0 -2887.644 -2859.627
## 7 4 4 -2885.182 -2838.487

Among all models, ARMA(0,0) is chosen as the optimal model for Bitcoin returns pre-COVID, as it has the lowest AIC and BIC values. Theoretically, returns in efficient markets might follow a random walk with no autocorrelation, which is consistent with an ARMA(0,0) model.

(e) Estimate model for Bitcoin returns using ARMA(0,0)

# Fit model
arma_fit <- arima(btc_returns, order = c(0,0,0))
coeftest(arma_fit)

## 
## z test of coefficients:
## 
##              Estimate  Std. Error z value Pr(>|z|)
## intercept -0.00057629  0.00137458 -0.4192    0.675

The intercept (mean) is statistically insignificant, confirming the ARMA(0,0) specification is appropriate.

(f) Post-estimation checks and Justification for GARCH models

We conduct diagnostic checks on the ARMA model residuals to ensure its robustness and to test for remaining dependencies. First, we check the residuals from the ARMA estimation to see whether the model had successfully accounted for all significant temporal dependencies. Next,we check squared residuals, aiming to detect any temporal dependencies in the second moment (conditional variance).

# (1) Check the inverse roots: since this is arma (0,0) we can skip this
# (2) Check the residuals with Ljung-Box and JB test 
checkresiduals(arma_fit) # Ljung-Box Ho: There is autocorrelation in the residuals

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,0) with non-zero mean
## Q* = 13.698, df = 10, p-value = 0.1872
## 
## Model df: 0.   Total lags used: 10

resid <- residuals(arma_fit)
jarque.bera.test(resid)

## 
##  Jarque Bera Test
## 
## data:  resid
## X-squared = 329.04, df = 2, p-value < 2.2e-16

# (3) Conduct heteroskedasticity test on residuals series from arma mod
ArchTest(resid, lags = 10) # Ho: No heteroskedasticity issue

## 
##  ARCH LM-test; Null hypothesis: no ARCH effects
## 
## data:  resid
## Chi-squared = 56.577, df = 10, p-value = 1.6e-08

# (4) Check on squared residuals
resid_sq <- resid^2
ggtsdisplay(resid_sq)

The ARCH-LM test is applied to the residuals to check for conditional heteroskedasticity. The highly significant p-value indicates the presence of volatility clustering. The correlogram of the squared residuals further confirms this, showing strong memory and dependencies. Therefore, we must employ a GARCH-type model to capture this time-varying variance.

(g) Estimate a GARCH model

To determine the optimal GARCH specification, various GARCH and EGARCH models with orders up to (2,2) were compared. Model selection was based on minimizing information criteria (AIC, BIC).

# Manually try some models and compare information criteria
compare_garch_models <- function(btc_returns) {
  # Define model specifications
  models <- list(
    "GARCH(1,1)" = ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                              mean.model = list(armaOrder = c(0, 0)), distribution.model = "ged"),
    "GARCH(1,2)" = ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 2)),
                              mean.model = list(armaOrder = c(0, 0)), distribution.model = "ged"),
    "GARCH(2,1)" = ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(2, 1)),
                              mean.model = list(armaOrder = c(0, 0)), distribution.model = "ged"),
    "GARCH(2,2)" = ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(2, 2)),
                              mean.model = list(armaOrder = c(0, 0)), distribution.model = "ged"),
    "EGARCH(1,1)" = ugarchspec(variance.model = list(model = "eGARCH", garchOrder = c(1, 1)),
                               mean.model = list(armaOrder = c(0, 0)), distribution.model = "ged"),
    "EGARCH(1,2)" = ugarchspec(variance.model = list(model = "eGARCH", garchOrder = c(1, 2)),
                               mean.model = list(armaOrder = c(0, 0)), distribution.model = "ged")
    )
  results <- data.frame(Model = character(), AIC = numeric(), BIC = numeric(), LogLik = numeric())
  for (name in names(models)) {
    fit <- ugarchfit(spec = models[[name]], data = btc_returns)
    results <- rbind(results, data.frame(
      Model = name,
      AIC = infocriteria(fit)[1],  # AIC
      BIC = infocriteria(fit)[2],  # BIC
      LogLik = fit@fit$LLH         # Log-Likelihood
    ))
  }
  return(results)
}
# Run function on BTC returns
compare_garch_models(btc_returns)

##         Model       AIC       BIC   LogLik
## 1  GARCH(1,1) -4.032964 -4.003335 1593.988
## 2  GARCH(1,2) -4.029726 -3.994172 1593.712
## 3  GARCH(2,1) -4.028795 -3.993240 1593.345
## 4  GARCH(2,2) -4.027261 -3.985781 1593.741
## 5 EGARCH(1,1) -4.034329 -3.998775 1595.526
## 6 EGARCH(1,2) -4.032041 -3.990561 1595.624

For Bitcoin preCOVID, EGARCH (1,1) is selected.

# Specify eGARCH(1,1) model with an ARMA(0,0) process in the mean
# Use GED in the error distribution
garchspec <- ugarchspec(variance.model = list(model = "eGARCH", garchOrder = c(1,1)), 
                        mean.model = list(armaOrder = c(0,0)), distribution.model = "ged")
# Estimation eGARCG (1,1)
garch_fit <- ugarchfit(btc_returns, spec = garchspec)
coef(garch_fit)

##            mu         omega        alpha1         beta1        gamma1 
##  0.0009701827 -0.1673441626 -0.0101978211  0.9751581538  0.2082855555 
##         shape 
##  0.8585957751

This is the estimation result for Bitcoin returns pre-COVID. The interpretation of these coefficients is included in the final report.

(h) Diagnostic Checks on the EGARCH Model

After estimation, we run final diagnostic checks on the standardized residuals and their squares to ensure no remaining autocorrelation or ARCH effects exist.

# Check residuals
garch_resid <- residuals(garch_fit, standardize = TRUE)
# ggtsdisplay(garch_resid)
# Check residuals squared
garch_resid_sq <- garch_resid^2
# ggtsdisplay(garch_resid_sq)
# Ljung-Box test statistic
Box.test(garch_resid, lag = 20)

## 
##  Box-Pierce test
## 
## data:  garch_resid
## X-squared = 24.501, df = 20, p-value = 0.2212

Box.test(garch_resid_sq, lag = 20)

## 
##  Box-Pierce test
## 
## data:  garch_resid_sq
## X-squared = 10.632, df = 20, p-value = 0.9552

# ARCH LM test
ArchTest(garch_resid, lags = 10)

## 
##  ARCH LM-test; Null hypothesis: no ARCH effects
## 
## data:  garch_resid
## Chi-squared = 7.6839, df = 10, p-value = 0.6597

All post-estimation tests on the standardized residuals return insignificant p-values. This confirms that the EGARCH(1,1) model is well-specified and has successfully captured the serial correlation and conditional heteroskedasticity in the Bitcoin return series.